Formatting Views of Whiteboards in Conjunction with Presenters

ABSTRACT

A videoconferencing endpoint determines if there is a whiteboard and if a presenter is near the whiteboard. If there is no whiteboard in view or the presenter is not near the whiteboard, any content from a camera focused on the whiteboard is continued and any presenter framing is done normally. If the presenter is in front of the whiteboard, any whiteboard content is ended, and appropriate portions of the whiteboard are included in the main video stream framed with the presenter. If the whiteboard is empty, framing is done without reference to the whiteboard. If the whiteboard is full or has writing away from the presenter, the entire whiteboard and the presenter are framed together. If the whiteboard only has writing near the presenter, only the relevant portion of the whiteboard is framed with the presenter.

CROSS-REFERENCE

This application claims priority to U.S. Provisional Application Ser. No. 63/161,133, filed Mar. 15, 2021, the contents of which are incorporated herein in their entirety by reference.

TECHNICAL FIELD

This disclosure relates generally to videoconferencing and relates particularly to detection of whiteboards and individuals in one or more captured audio-visual streams.

BACKGROUND

Currently whiteboards are treated primarily as content sources, so that the whiteboard is provided as a content stream. A presenter is seen in a video stream, even if the presenter moves around. In some cases, a camera is dedicated to a whiteboard, but then a user must switch the video source being provided to the far end between the whiteboard and the presenter. If the presenter is standing near or in front of the whiteboard, any framing with the whiteboard can become confusing. For example, the whiteboard is provided in the content stream and displayed on a content monitor, but the whiteboard is also present in the presenter video stream and on the main monitor.

BRIEF DESCRIPTION OF THE DRAWINGS

For illustration, there are shown in the drawings certain examples described in the present disclosure. In the drawings, like numerals indicate like elements throughout. The full scope of the inventions disclosed herein is not limited to the precise arrangements, dimensions, and instruments shown. In the drawings:

FIG. 1 is an illustration of a conference room including multiple cameras and a whiteboard according to examples of the present disclosure.

FIG. 2A is an illustration of a presenter separated from a whiteboard and the resulting framing according to examples of the present disclosure.

FIG. 2B is an illustration of a presenter standing in front of an empty whiteboard and the resulting framing according to examples of the present disclosure.

FIG. 2C is an illustration of a presenter standing in front of a whiteboard full of writing and the resulting framing according to examples of the present disclosure.

FIG. 2D is an illustration of a presenter standing in front of a whiteboard only partially filled with writing and the resulting framing according to examples of the present disclosure.

FIG. 3 is a high-level flowchart of framing operations according to examples of the present disclosure.

FIG. 4A is a flowchart of framing operations involving a presenter and a whiteboard according to examples of the present disclosure.

FIG. 4B is the flowchart of FIG. 4A with added delay periods.

FIG. 4C is the flowchart of FIG. 4A when the whiteboard is never provided as content.

FIG. 5 is a high-level block diagram of a videoconferencing system according to examples of the present disclosure.

FIG. 6 is a more detailed block diagram of the videoconferencing system of FIG. 5 according to examples of the present disclosure.

FIG. 7 is a block diagram of a system on a chip for use in the videoconferencing systems of FIGS. 5 and 6.

DETAILED DESCRIPTION OF THE EMBODIMENTS

Far end viewer comprehension is improved in examples according to the present disclosure. A near end videoconferencing endpoint determines if there is a whiteboard and if a presenter is near the whiteboard. If there is no whiteboard in view or the presenter is not near the whiteboard, any content from a camera focused on the whiteboard is continued and any presenter framing is done normally. If the presenter is in front of the whiteboard, any whiteboard content is ended, and appropriate portions of the whiteboard are included in the main video stream framed with the presenter. If the whiteboard is empty, framing is done without reference to the whiteboard. If the whiteboard is full or has writing away from the presenter, the entire whiteboard and the presenter are framed together. If the whiteboard only has writing near the presenter, only the relevant portion of the whiteboard is framed with the presenter. By including the whiteboard in the framing with the presenter and turning off any whiteboard content stream when the presenter is near the whiteboard, the far end viewer does not see the whiteboard in two different streams.

Referring now to FIG. 1, a conference room C configured for use in videoconferencing is illustrated. The conference room C is an exemplary near end location. Conference room C includes a conference table 10 and a series of chairs 12A-12F. A whiteboard 16 is located on one wall of the conference room C. A videoconferencing endpoint 498, which includes a camera 502 to view individuals seated in the various chairs 12A-12F and the whiteboard 16 and a microphone array 504 to determine speaker direction, is provided at one end of the conference room C. A second camera and microphone array combination 510 is provided on one side of the conference room C and has a clearer view of the whiteboard 16. A third camera and microphone array 512 is provided on a side of the conference room C holding the whiteboard 16. A content camera 511 is mounted opposite the whiteboard 16 to capture the whiteboard 16 to provide as a content stream. A monitor or television 506 is provided to display the far end conference site or sites and generally to provide the loudspeaker output.

FIG. 2A illustrates a presenter P separated from the whiteboard 16. As there is no overlap between a framed view F1 of the presenter P and the whiteboard 16, the presenter P is framed normally and the content camera 511 provides the view of the whiteboard 16 as content in the videoconference. Because there is no overlap, there is no confusion by a viewer at the far end.

In FIG. 2B, the presenter P has moved to be in front of the whiteboard 16. The framed view F2 of the presenter P now overlaps the whiteboard 16. Therefore, to avoid potential viewer confusion, the content camera 511 is no longer providing content. As the whiteboard 16 is empty, there is nothing on the whiteboard 16 to display, so the size and location of the framing view F2 are based only on the presenter P and the framing view F2 is the same size as the framing view F1.

In FIG. 2C, the whiteboard 16 has been filled with two columns 200 and 202 of writing. The presenter P has not moved from FIG. 2B. As the presenter P is in front of a whiteboard 16 containing writing, the framing view F3 includes the presenter P and the entire whiteboard 16, as the whiteboard 16 is substantially filled with writing. No content is being provided from the content camera 511 as the presenter P is in front of the whiteboard 16 and the contents of the whiteboard 16 are provided in the framing view F3.

FIG. 2D is like FIG. 2C, except that the whiteboard 16 only contains writing in the left column 200. The right portion of the whiteboard 16 is empty. As the right portion of the whiteboard 16 is empty, that portion need not be shown in framing view F4, which is based on the presenter P and the left column 200. As with FIG. 2C, in FIG. 2D no content is being provided from the content camera 511.

By including the presence of the whiteboard 16 and any writing on the whiteboard 16 in the decisions for framing the presenter P, and appropriately controlling the transmission of the whiteboard as content, viewer confusion is reduced.

Referring now to FIG. 3, a high-level flowchart 300 of camera framing by a near end videoconference endpoint is illustrated. In step 302, video streams are received from any cameras and audio streams are received from any microphone arrays. In step 304, regions of interest are located. Regions of interest are objects or areas that are of interest in performing framing decisions. Regions of interest include conference participants but also objects in the conference room, such as the whiteboard 16 or any object to which the participants' views may be directed. In some examples according to the present disclosure, neural networks are trained for face and body finding and for detecting the presence of various objects that can be regions of interest. In the present examples, the objects include a whiteboard, including the amount and location of writing on the whiteboard. For a whiteboard, the output of the neural network can include not only a bounding box for the whiteboard but also outputs related to the amount and location of writing. In some examples, the amount and location of the writing are determined in a second, specialized neural network to simplify the training of the neural network performing the main region of interest detection. The bounding box information for the whiteboard is provided as an input to the specialized neural network to minimize the requirements of the specialized neural network. In some examples, the face and body finding are performed in one neural network and other regions of interest, such as the whiteboard, are detected in a different neural network, allowing simplification of each neural network and reuse of existing face and body finding neural networks. In those examples, the detection of any writing on the whiteboard can be performed by the region of interest detection neural network or in an additional neural network as described.

In step 306, the audio streams from the microphone arrays are used for sound source localization (SSL), with the SSL results then used in combination with the video streams to find talkers. In the case of a presenter in front of a whiteboard, there is generally only a single talker to be framed.
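As an illustration only, one way the SSL results of step 306 might be combined with the video detections is sketched below. The disclosure does not specify how the combination is performed, so the Box type, the linear pixel-to-angle mapping, the horizontal field of view parameter and the ten degree tolerance are all assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixels (hypothetical helper type)."""
    left: float
    top: float
    right: float
    bottom: float


def box_azimuth_deg(box: Box, frame_width: float, hfov_deg: float) -> float:
    """Approximate horizontal angle of the box centre from the optical axis.

    Assumes a simple linear mapping from pixel column to angle, which is
    only a rough approximation for wide-angle lenses.
    """
    center_x = (box.left + box.right) / 2.0
    return (center_x / frame_width - 0.5) * hfov_deg


def find_talker(people: List[Box], ssl_azimuth_deg: float, frame_width: float,
                hfov_deg: float, tolerance_deg: float = 10.0) -> Optional[Box]:
    """Return the detected person closest to the SSL direction, if any.

    A person is accepted only when the angular error is within
    tolerance_deg (an assumed threshold); otherwise None is returned.
    """
    best = None
    best_error = tolerance_deg
    for person in people:
        error = abs(box_azimuth_deg(person, frame_width, hfov_deg) - ssl_azimuth_deg)
        if error <= best_error:
            best, best_error = person, error
    return best
```

In this sketch the SSL azimuth is assumed to be expressed in the same angular frame as the camera, which holds approximately when the microphone array and camera are co-located, as in the integrated unit of FIG. 1.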

After the talkers are found in step 306, in step 308 the parties are framed as desired. Framing is usually based on the locations and numbers of talkers or participants to be framed. Examples according to the present disclosure add the location of a whiteboard into the framing considerations. Details of the framing according to examples of the present disclosure are provided in FIGS. 4A, 4B and 4C.

FIG. 4A provides details of step 308 for examples according to the present disclosure when a whiteboard may be involved. In step 402, a determination is made whether the regions of interest (ROIs) from step 304 include a whiteboard. If so, in step 404 a determination is made whether the presenter, the talker as determined based on SSL and video observations, is near the whiteboard. The presenter is considered near if the presenter is in a position such that the framed view of the presenter would include the whiteboard. If not, or if there is no whiteboard ROI, in step 406 any content from a whiteboard, as from the content camera 511, is provided as content in the videoconference. Operation proceeds to step 408, where normal framing operations are performed, as illustrated in FIG. 2A. These normal framing operations can be rule of thirds framing, centered framing, and the like.
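The nearness test of step 404 can be sketched as a rectangle overlap check between the framed view of the presenter and the whiteboard bounding box. The following is a minimal sketch under stated assumptions: the Box type, the normal_frame helper and its proportional margin stand in for whatever normal framing rule (rule of thirds, centered, and the like) an endpoint actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixels (hypothetical helper type)."""
    left: float
    top: float
    right: float
    bottom: float


def normal_frame(presenter: Box, margin: float = 0.5) -> Box:
    """Assumed stand-in for normal framing: pad the presenter box on every
    side by a fraction of its width and height."""
    width = presenter.right - presenter.left
    height = presenter.bottom - presenter.top
    return Box(presenter.left - margin * width,
               presenter.top - margin * height,
               presenter.right + margin * width,
               presenter.bottom + margin * height)


def overlaps(a: Box, b: Box) -> bool:
    """True when the two rectangles share any area."""
    return (a.left < b.right and b.left < a.right
            and a.top < b.bottom and b.top < a.bottom)


def presenter_near_whiteboard(presenter: Box, whiteboard: Box,
                              margin: float = 0.5) -> bool:
    """Step 404 as described: near means the framed view of the presenter
    would include at least part of the whiteboard."""
    return overlaps(normal_frame(presenter, margin), whiteboard)
```

With a whiteboard spanning columns 0 to 1000, a presenter whose padded frame starts at column 1300 is not near (as in FIG. 2A), while a presenter whose padded frame starts at column 800 is near (as in FIG. 2B).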

If the presenter is near the whiteboard in step 404, in step 410 transmission of the whiteboard as content is discontinued. In step 412, it is determined if the whiteboard is empty. If so, the whiteboard need not be considered in framing determinations and operation proceeds to step 408, for framing as illustrated in FIG. 2B.

If the whiteboard is not empty, in step 414 it is determined if the whiteboard is substantially full of writing or if the writing is only in portions not adjacent to the presenter. If the whiteboard is full or the writing is not adjacent to the presenter, in step 416 framing is based on the presenter and the entire whiteboard, as in FIG. 2C. If the whiteboard is only partially filled and the filled portion is adjacent to the presenter, in step 418 framing is based on the presenter and just the portion of the whiteboard containing the writing, as in FIG. 2D.

In a simplified example, there is no evaluation of the amount or location of any writing on the whiteboard and the presenter is simply framed with the entire whiteboard when the presenter is near the whiteboard, so that the framing is as shown in FIG. 2C even if there is no writing or the writing is adjacent to the presenter.
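The decisions of steps 402 through 418 of FIG. 4A can be summarized in a short sketch. This is an illustration under assumptions, not the disclosed implementation: the Box and FramingDecision types, the 80 percent width threshold for "substantially full" and the 100 pixel adjacency gap are hypothetical values chosen for the example, and the returned region is only the area on which a later framing rule would be based.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixels (hypothetical helper type)."""
    left: float
    top: float
    right: float
    bottom: float


@dataclass(frozen=True)
class FramingDecision:
    region: Box                  # area the final framing is based on
    whiteboard_as_content: bool  # whether any content stream keeps running


def union(a: Box, b: Box) -> Box:
    """Smallest box containing both inputs."""
    return Box(min(a.left, b.left), min(a.top, b.top),
               max(a.right, b.right), max(a.bottom, b.bottom))


def decide_framing(presenter: Box, whiteboard: Optional[Box],
                   writing: Optional[Box], near: bool,
                   full_fraction: float = 0.8,
                   adjacency_gap: float = 100.0) -> FramingDecision:
    # Steps 402/404: no whiteboard ROI, or presenter not near it.
    if whiteboard is None or not near:
        return FramingDecision(presenter, True)        # steps 406 and 408
    # Step 410: content is discontinued from here on.
    if writing is None:                                # step 412: empty board
        return FramingDecision(presenter, False)       # step 408
    # Step 414: assumed tests for "substantially full" and "adjacent".
    full = (writing.right - writing.left) >= full_fraction * (
        whiteboard.right - whiteboard.left)
    adjacent = (writing.left <= presenter.right + adjacency_gap
                and presenter.left - adjacency_gap <= writing.right)
    if full or not adjacent:
        return FramingDecision(union(presenter, whiteboard), False)  # step 416
    return FramingDecision(union(presenter, writing), False)         # step 418
```

The simplified example described above corresponds to skipping the writing tests and always returning the union of the presenter and the entire whiteboard when the presenter is near.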

If the presenter is pacing, so that the whiteboard comes into and out of a framing view of the presenter, a situation might arise where the whiteboard content stream is rapidly and repeatedly turned on and off. This would be distracting to the viewer at the far end, so in some examples time delays are included after the determination of step 404 as shown in FIG. 4B. If the talker is not near the whiteboard in step 404, in step 450 it is determined if the talker has been away from the whiteboard for a desired period, such as five seconds. If so, then the whiteboard is provided as content in step 406. If the talker has not been away for the desired period, operation proceeds to step 410, with the whiteboard remaining discontinued as content.

If the talker is near the whiteboard in step 404, in step 452 it is determined if the talker has been near the whiteboard for a desired period, such as five seconds. If so, operation proceeds to step 410 and the provision of the whiteboard as content is discontinued. If the desired period has not elapsed, operation proceeds to step 406, where the whiteboard continues to be provided as content.
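The time delays of FIG. 4B amount to a hold-off on the content stream state. A minimal sketch follows; the ContentGate class and its method names are hypothetical, the five second default follows the example period in the text, and holding the previous state until the new condition has persisted is an approximation of steps 450 and 452 in steady operation.

```python
class ContentGate:
    """Hypothetical helper that switches the whiteboard content stream only
    after the near/away condition has persisted for hold_seconds."""

    def __init__(self, hold_seconds: float = 5.0, sending: bool = True):
        self.hold_seconds = hold_seconds
        self.sending = sending   # True while the whiteboard is sent as content
        self._near = None        # last observed result of step 404
        self._since = None       # time at which that result was first seen

    def update(self, near: bool, now: float) -> bool:
        """Feed one step 404 result with a timestamp in seconds; returns
        whether the whiteboard should currently be provided as content."""
        if near != self._near:
            # Condition changed: restart the timer (steps 450 / 452).
            self._near = near
            self._since = now
        if now - self._since >= self.hold_seconds:
            # Near long enough: discontinue content (step 410).
            # Away long enough: provide content (step 406).
            self.sending = not near
        return self.sending
```

For example, a presenter who steps in front of the whiteboard for one second and then steps away never triggers a change, while a presenter who stays near for five seconds causes the content stream to be discontinued.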

Operation is similar even if the whiteboard is never provided as content, such as when there is no camera aimed at the whiteboard to operate as a content camera. This operation is shown in FIG. 4C. FIG. 4C is FIG. 4A with steps 406 and 410 removed as the whiteboard is never to be provided as content. A similar modification can be made to FIG. 4B to retain the time delays of FIG. 4B.

While whiteboards have been discussed above, it is understood that other objects are similar to whiteboards, so the term whiteboard as used herein is not limited to just dry erase whiteboards per se but includes similar items, such as smart or interactive whiteboards, flip charts, extra-large sticky notes, bulletin boards with paper on them, boards (including Kanban boards and scrum boards), clusters of sticky notes, a wall with a projected image from an interactive projector, etc., all of which are broadly considered as interactive group presentation devices.

While writing on the whiteboard has been discussed above, it is understood that writing is used broadly, so that other information besides the illustrated textual information, such as graphical information, pre-printed materials, etc. placed on or displayed by the whiteboard, is classified as writing, all of which are broadly considered as information.

In the examples of this disclosure, a content camera 511 has been described as capturing the whiteboard to be provided as content. If the whiteboard is a smart or interactive whiteboard, the whiteboard itself may be providing the content image. If the whiteboard is an image projected by an interactive projector, the projector may be providing the content image. The transmission of the content image in either case would be controlled as described in FIGS. 4A and 4B, just in cooperation with the smart whiteboard or interactive projector instead of the content camera 511.

While the use of neural networks has been described to determine the presence of a whiteboard and the amount of writing on a whiteboard, it is understood that more conventional computer vision techniques can also be used.
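As one hedged illustration of a conventional technique, the horizontal extent of writing can be estimated by thresholding a grayscale crop of the whiteboard and finding the columns that contain dark pixels. The function names, the threshold of 128 and the minimum ink count below are assumptions for this sketch; a practical system would also need to handle glare, shadows and occlusion by the presenter.

```python
from typing import List, Optional, Tuple


def writing_extent(gray: List[List[int]], dark_threshold: int = 128,
                   min_ink_pixels: int = 1) -> Optional[Tuple[int, int]]:
    """Return (first_column, last_column) of columns containing writing,
    or None when the whiteboard crop appears empty.

    gray is a row-major grayscale image of the whiteboard region only,
    with values 0 (black) to 255 (white).
    """
    if not gray or not gray[0]:
        return None
    width = len(gray[0])
    ink_columns = [
        x for x in range(width)
        if sum(1 for row in gray if row[x] < dark_threshold) >= min_ink_pixels
    ]
    if not ink_columns:
        return None
    return ink_columns[0], ink_columns[-1]


def fill_fraction(extent: Optional[Tuple[int, int]], width: int) -> float:
    """Fraction of the whiteboard width spanned by the writing."""
    if extent is None or width <= 0:
        return 0.0
    first, last = extent
    return (last - first + 1) / width
```

The resulting extent and fill fraction could feed the empty, full and partially filled determinations of steps 412 and 414 in place of a neural network output.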

In examples according to the present disclosure, the camera with the best view of the presenter P and whiteboard 16 is used for the framing operations and its framed view is then transmitted to the far end. For example, in FIG. 1 that camera would be camera 510, absent a participant standing in front of the camera 510.

While this disclosure has focused on the use of a whiteboard in a conference room, it is understood that the whiteboard and presenter may be in many different settings, including a classroom, an auditorium, a lecture hall, a theater and so on.

Additionally, while the whiteboard 16 has been shown mounted on a wall, the whiteboard may also be freestanding or a portion of another object.

By including the whiteboard in presenter or talker framing decisions when the presenter is near or in front of the whiteboard, the experience of viewers at the far end is improved as confusion with provision of the whiteboard as content is reduced, particularly if the provision of whiteboard content is coordinated with the presenter framing decisions so that the whiteboard is not presented in both the normal video stream and the content stream at the same time.

FIG. 5 illustrates an exemplary videoconferencing endpoint 498 as used at a near end or a far end according to the present disclosure. A codec 500, the processing unit of the videoconferencing endpoint 498, performs the necessary processing. In the illustrated example, a camera 502 and a microphone array 504 are included in the codec 500 to form an integrated unit, such as a bar. An external microphone 508 is connected to the codec 500 to be used on a conference room table. Cameras 510 and 512, which include integrated microphone arrays, are connected to the codec 500 to provide alternate or additional views or video streams. A content camera 511 is connected to the codec 500 to provide a content stream for use in the videoconference. A television or monitor 506, including a speaker, is also connected to the codec 500 to provide video and audio output. Additional monitors can be used if desired to provide greater flexibility in displaying conference participants and conference content.

The codec 500 is connected to a corporate or other local area network (LAN) 514. The corporate LAN 514 is connected to a firewall 516 and then the Internet 518 in a common configuration to allow communication with a remote endpoint 634 at a far end.

Details of the codec 500 are shown in FIG. 6. In the illustrated example, a system on chip (SoC) 600 is the primary component of the codec 500. The SoC 600 is similar to those used for cellular telephones and handheld equipment, such as a Tegra X1 or Qualcomm 835. The SoC 600 may be included as the main component on a system on module (SOM), such as nVidia™ Jetson TX1 or Intrinsyc™ Open-Q™ 835 System on Module. The SoC 600 contains the CPUs 601, DSP(s) 602, a GPU 606, onboard RAM 608, a video encode and decode module 614, an HDMI output module 616, a camera inputs module 618, a DRAM interface 610, a flash memory interface and an I/O module 622. The I/O module 622 provides audio inputs and outputs, such as I2S signals; USB interfaces; an SDIO interface; PCIe interfaces; an SPI interface; an I2C interface and various general purpose I/O pins (GPIO).

Cameras 510, 512 and content camera 511 are connected to the camera inputs module 618. The monitor and speaker 506 is connected to the HDMI output module 616. External DRAM 612 and a Wi-Fi/Bluetooth module 620 are connected to the SoC 600 to provide the needed bulk operating memory (RAM associated with each CPU and DSP is not shown) and additional I/O capabilities commonly used today. An audio codec 624 is connected to the SoC 600 to provide local analog line level capabilities. An analog microphone 508 is connected to the audio codec 624.

Preferably two network interface chips (NICs) 626, 628, such as Intel I210, are connected to the PCIe interfaces of the SoC 600. In the illustrated embodiment, NIC 626 is for connection to the corporate LAN 514 and then to IP microphones 632, the Internet 518 and remote or far end endpoints 634, while the other NIC 628 is used for local connection of IP-connected devices, such as IP microphones 630.

Flash memory 604 is connected to the SoC 600 to hold the programs that are executed by the CPUs 601 and DSPs 602 to provide the endpoint functionality of the codec 500, including the whiteboard and presenter framing discussed above. Illustrated modules include a video codec 650, camera control 652, face, body and ROI finding 653, neural network models 655, framing 654, other video processing 656, audio codec 658, audio processing 660, sound source localization 661, network operations 666, user interface 668 and operating system and various other modules 670. The RAM 608 and DRAM 612 are used for storing any of the modules in the flash memory 604 when the module is executing and for storing video images of video streams and audio samples of audio streams, and can be used for scratchpad operation of the SoC 600. The neural network models 655 and face, body and ROI finding 653 are used with the framing 654 to perform the whiteboard and presenter detection and framing as described above for FIGS. 3 and 4A-4C and illustrated in FIGS. 2A-2D.

FIG. 7 is a block diagram of an exemplary system on a chip (SoC) 700 as can be used as the SoC 600 in the codec 500. A series of more powerful microprocessors 702, such as ARM® A72 or A53 cores, form the CPUs 601 or primary general purpose processing block of the SoC 700, while a more powerful digital signal processor (DSP) 704 and multiple less powerful DSPs 705, together the DSPs 602, provide specialized computing capabilities. A simpler processor 706, such as ARM R5F cores, provides general control capability in the SoC 700. The more powerful microprocessors 702, more powerful DSP 704, less powerful DSPs 705 and simpler processor 706 each include various data and instruction caches, such as L1I, L1D, and L2D, to improve speed of operations. A high speed interconnect 708 connects the microprocessors 702, more powerful DSP 704, less powerful DSPs 705 and simpler processor 706 to various other components in the SoC 700. For example, a shared memory controller 710, which includes onboard memory or SRAM 608, is connected to the high speed interconnect 708 to act as the onboard SRAM for the SoC 700. A DDR (double data rate) memory controller system 714 is connected to the high speed interconnect 708 and acts as an external interface to external DRAM memory. A video acceleration module 716 and a radar processing accelerator (PAC) module 718 are similarly connected to the high speed interconnect 708. A neural network acceleration module 717 is provided for hardware acceleration of neural network operations. A vision processing accelerator (VPACC) module is the video encoder/decoder 614 and is connected to the high speed interconnect 708, as is a depth and motion PAC (DMPAC) module 722.

A graphics acceleration module 724 is connected to the high speed interconnect 708. A display subsystem as the HDMI output 616 is connected to the high speed interconnect 708 to allow operation with and connection to various video monitors. A system services block 732, which includes items such as DMA controllers, memory management units, general purpose I/O's, mailboxes, and the like, is provided for normal SoC 700 operation. A serial connectivity module 734 is connected to the high speed interconnect 708 and includes modules as normal in an SoC. A connectivity module 736 provides interconnects for external communication interfaces, such as PCIe block 738, USB block 740 and an Ethernet switch 742. A capture/MIPI module is the camera interface 618 and includes a four lane CSI-2 compliant transmit block 746 and a four lane CSI-2 receive module and hub.

An MCU island 760 is provided as a secondary subsystem and handles operation of the integrated SoC 700 when the other components are powered down to save energy. An MCU ARM processor 762, such as one or more ARM R5F cores, operates as a master and is coupled to the high speed interconnect 708 through an isolation interface 761. An MCU general purpose I/O (GPIO) block 764 operates as a slave. MCU RAM 766 is provided to act as local memory for the MCU ARM processor 762. A CAN bus block 768, an additional external communication interface, is connected to allow operation with a conventional CAN bus environment in a vehicle. An Ethernet MAC (media access control) block 770 is provided for further connectivity. External memory, generally non-volatile memory (NVM) such as flash memory 604, is connected to the MCU ARM processor 762 via an external memory interface 769 to store instructions loaded into the various other memories for execution by the various appropriate processors. The MCU ARM processor 762 operates as a safety processor, monitoring operations of the SoC 700 to ensure proper operation of the SoC 700.

It is understood that this is one example of an SoC provided for explanation and many other SoC examples are possible, with varying numbers of processors, DSPs, accelerators and the like.

A system of one or more computers can be configured to perform particular operations or actions by virtue of having software, firmware, hardware, or a combination of them installed on the system that in operation causes or cause the system to perform the actions. One or more computer programs can be configured to perform particular operations or actions by virtue of including instructions that, when executed by data processing apparatus, cause the apparatus to perform the actions. One general aspect includes a method of presenting a talker and a whiteboard to a far end of a videoconference. The method also includes receiving at least one video stream containing both the talker and the whiteboard. The method also includes determining the presence of the talker near the whiteboard. The method also includes when the talker is near the whiteboard, framing the talker and the whiteboard together for provision to the far end. Other embodiments of this aspect include corresponding computer systems, apparatus, and computer programs recorded on one or more computer storage devices, each configured to perform the actions of the methods.

Implementations may include one or more of the following features. The method may include determining the presence of writing on the whiteboard, where framing the talker and the whiteboard together is performed only when there is writing on the whiteboard. Determining the presence of writing on the whiteboard includes determining that the writing only partially fills the whiteboard and the writing is adjacent to the talker, where framing the talker and the whiteboard together frames the talker and only the portion of the whiteboard adjacent to the talker containing the writing when the writing only partially fills the whiteboard and the writing is adjacent to the talker. Determining the presence of writing on the whiteboard includes determining that the writing fills the whiteboard, where framing the talker and the whiteboard together frames the talker and the entire whiteboard when the determining the presence of writing on the whiteboard determines that the writing fills the whiteboard. Where the near end environment further contains a camera for providing a view of the whiteboard as content in the videoconference, the method may include discontinuing provision of the whiteboard as content when the talker and the whiteboard are framed together. The method may include continuing provision of the whiteboard as content when the talker is not near the whiteboard. Determining the presence of the talker near the whiteboard includes detecting regions of interest in the at least one video stream and determining if a region of interest is a whiteboard. Implementations of the described techniques may include hardware, a method or process, or computer software on a computer-accessible medium.

The above description is intended to be illustrative, and not restrictive. For example, the above-described embodiments may be used in combination with each other. Many other embodiments will be apparent to those of skill in the art upon reviewing the above description. The scope of the invention should, therefore, be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled. In the appended claims, the terms “including” and “in which” are used as the plain-English equivalents of the respective terms “comprising” and “wherein.”

1. A method of presenting a talker and a whiteboard to a far end of avideoconference, the near end environment containing the talker, thewhiteboard and at least one video camera for providing a video stream tothe far end, the at least one video camera having both the talker andthe whiteboard within its field of view, the method comprising:receiving at least one video stream containing both the talker and thewhiteboard; determining the presence of the talker near the whiteboard;and when the talker is near the whiteboard, framing the talker and thewhiteboard together for provision to the far end.
 2. The method of claim1, further comprising: determining the presence of writing on thewhiteboard, and wherein framing the talker and the whiteboard togetheris performed only when there is writing on the whiteboard.
 3. The methodof claim 2, wherein determining the presence of writing on thewhiteboard includes determining that the writing only partially fillsthe whiteboard and the writing is adjacent to the talker, and whereinframing the talker and the whiteboard together frames the talker andonly the portion of the whiteboard adjacent to the talker containing thewriting when the writing only partially fills the whiteboard and thewriting is adjacent to the talker.
 4. The method of claim 2, whereindetermining the presence of writing on the whiteboard includesdetermining that the writing fills the whiteboard, and wherein framingthe talker and the whiteboard together frames the talker and the entirewhiteboard when the determining the presence of writing on thewhiteboard determines that the writing fills the whiteboard.
 5. Themethod of claim 1, the near end environment further containing a camerafor providing a view of the whiteboard as content in thevideoconference, the method further comprising: discontinuing provisionof the whiteboard as content when the talker and the whiteboard areframed together.
 6. The method of claim 5, further comprising:continuing provision of the whiteboard as content when the talker is notnear the whiteboard.
 7. The method of claim 1, wherein determining thepresence of the talker near the whiteboard includes: detecting regionsof interest in the at least one video stream; and determining if aregion of interest is a whiteboard.
 8. A videoconference endpoint foruse in a near end environment containing a talker, an interactive grouppresentation device and at least one video camera for providing a videostream to a far end videoconference endpoint, the at least one videocamera having both the talker and the interactive group presentationdevice within its field of view, comprising: a processor; a networkinterface coupled to the processor for connection to a far endvideoconference endpoint; a camera interface coupled to the processorfor receiving at least one video stream having both the talker and theinteractive group presentation device; a video output interface coupledto the processor for providing a video stream to a display forpresentation; and memory coupled to the processor for storinginstructions executed by the processor to perform the operations of:receiving at least one video stream containing both the talker and theinteractive group presentation device; determining the presence of thetalker near the interactive group presentation device; and when thetalker is near the interactive group presentation device, framing thetalker and the interactive group presentation device together forprovision to the far end.
 9. The videoconference endpoint of claim 8,the memory further storing instructions executed by the processor toperform the operations of: determining the presence of information onthe interactive group presentation device, and wherein framing thetalker and the interactive group presentation device together isperformed only when there is information on the interactive grouppresentation device.
 10. The videoconference endpoint of claim 9, wherein determining the presence of information on the interactive group presentation device includes determining that the information only partially fills the interactive group presentation device and the information is adjacent to the talker, and wherein framing the talker and the interactive group presentation device together frames the talker and only the portion of the interactive group presentation device adjacent to the talker containing the information when the information only partially fills the interactive group presentation device and the information is adjacent to the talker.
 11. The videoconference endpoint of claim 9, wherein determining the presence of information on the interactive group presentation device includes determining that the information fills the interactive group presentation device, and wherein framing the talker and the interactive group presentation device together frames the talker and the entire interactive group presentation device when the determining the presence of information on the interactive group presentation device determines that the information fills the interactive group presentation device.
 12. The videoconference endpoint of claim 8, the near end environment further containing a camera for providing a view of the interactive group presentation device as content in the videoconference, the memory further storing instructions executed by the processor to perform the operations of: discontinuing provision of the interactive group presentation device as content when the talker and the interactive group presentation device are framed together.
 13. The videoconference endpoint of claim 12, the memory further storing instructions executed by the processor to perform the operations of: continuing provision of the interactive group presentation device as content when the talker is not near the interactive group presentation device.
 14. The videoconference endpoint of claim 8, wherein determining the presence of the talker near the interactive group presentation device includes: detecting regions of interest in the at least one video stream; and determining if a region of interest is an interactive group presentation device.
 15. A non-transitory processor readable memory containing instructions that when executed cause a processor or processors to perform the following method of framing a talker, the near end environment containing a talker, a whiteboard and at least one video camera for providing a video stream to a far end, the at least one video camera having both the talker and the whiteboard within its field of view, the method comprising: receiving at least one video stream containing both the talker and the whiteboard; determining the presence of the talker near the whiteboard; and when the talker is near the whiteboard, framing the talker and the whiteboard together for provision to the far end.
 16. The non-transitory processor readable memory of claim 15, the method further comprising: determining the presence of writing on the whiteboard, and wherein framing the talker and the whiteboard together is performed only when there is writing on the whiteboard.
 17. The non-transitory processor readable memory of claim 16, wherein determining the presence of writing on the whiteboard includes determining that the writing only partially fills the whiteboard and the writing is adjacent to the talker, and wherein framing the talker and the whiteboard together frames the talker and only the portion of the whiteboard adjacent to the talker containing the writing when the writing only partially fills the whiteboard and the writing is adjacent to the talker.
 18. The non-transitory processor readable memory of claim 16, wherein determining the presence of writing on the whiteboard includes determining that the writing fills the whiteboard, and wherein framing the talker and the whiteboard together frames the talker and the entire whiteboard when the determining the presence of writing on the whiteboard determines that the writing fills the whiteboard.
 19. The non-transitory processor readable memory of claim 19, the near end environment further containing a camera for providing a view of the whiteboard as content in the videoconference, the method further comprising: discontinuing provision of the whiteboard as content when the talker and the whiteboard are framed together.
 20. The non-transitory processor readable memory of claim 19, the method further comprising: continuing provision of the whiteboard as content when the talker is not near the whiteboard.
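
For illustration only, and without limiting the claims, the framing decisions recited above can be sketched in Python. The sketch assumes that bounding boxes for the talker, the whiteboard, and any writing have already been obtained by some detector. The names Box, choose_framing, NEAR_PX and FILL_FRACTION, the pixel and fraction thresholds, and the use of horizontal distance as the test for "near" and "adjacent" are all hypothetical choices made for this example; they are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical helper)."""
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def width(self) -> float:
        return self.x1 - self.x0

    def union(self, other: "Box") -> "Box":
        """Smallest box containing both boxes."""
        return Box(min(self.x0, other.x0), min(self.y0, other.y0),
                   max(self.x1, other.x1), max(self.y1, other.y1))

    def h_gap(self, other: "Box") -> float:
        """Horizontal distance between boxes; 0 if they overlap in x."""
        return max(0.0, max(self.x0, other.x0) - min(self.x1, other.x1))


# Assumed thresholds; a real endpoint would tune or replace these.
NEAR_PX = 100.0        # talker counts as "near" within this horizontal gap
FILL_FRACTION = 0.8    # writing "fills" the board at or above this width share


def choose_framing(talker: Box,
                   whiteboard: Optional[Box],
                   writing: Optional[Box]) -> Tuple[Box, bool]:
    """Return (frame, keep_content_stream) for one video frame."""
    # No whiteboard, or talker not near it: frame normally, keep content.
    if whiteboard is None or talker.h_gap(whiteboard) > NEAR_PX:
        return talker, True
    # Talker near an empty whiteboard: frame without reference to the board.
    # Content is ended here, following the abstract's description.
    if writing is None:
        return talker, False
    fills = writing.width >= FILL_FRACTION * whiteboard.width
    adjacent = talker.h_gap(writing) <= NEAR_PX
    # Board is full, or writing is away from the talker: frame entire board.
    if fills or not adjacent:
        return talker.union(whiteboard), False
    # Writing only near the talker: frame just that portion with the talker.
    return talker.union(writing), False
```

In this sketch the second element of the returned pair indicates whether a separate whiteboard content stream would continue. A talker standing next to a small patch of writing yields a frame covering only the talker and that patch, while writing far from the talker, or writing spanning most of the board, yields a frame covering the talker and the whole whiteboard.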